Evaluating Probabilistic Matrix Factorization on Netflix Dataset
Author: not recorded
Abstract
Collaborative Filtering (CF) attempts to make automatic taste recommendations by examining a large amount of taste information. Methods for achieving Collaborative Filtering can be broadly categorized into model-based and memory-based techniques. In this project, we review and implement three variants of Probabilistic Matrix Factorization (PMF), a model-based Collaborative Filtering algorithm. We compare the performance of Probabilistic Matrix Factorization to a memory-based Collaborative Filtering algorithm, the K nearest neighbor (KNN) algorithm. Model performance is compared using three datasets derived from the full Netflix movie dataset, each with a different level of data sparsity. Specifically, the root mean square error (RMSE) is used as the metric to compare the relative performance of the different CF algorithms. In our performance evaluation, we found that when data sparsity is sufficiently low, Probabilistic Matrix Factorization outperforms K nearest neighbor, achieving an RMSE as low as 0.83. However, when data sparsity is high, KNN yields marginally better performance, obtaining an RMSE of 1.0453 on the sparsest dataset, while the regular Probabilistic Matrix Factorization scored 1.0771 on the same dataset.
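To make the evaluation metric and the basic model concrete, here is a minimal sketch in pure Python. It is not the authors' implementation: it assumes the plain PMF variant fitted as a MAP estimate by stochastic gradient descent on the regularized squared error, and the function names (`rmse`, `train_pmf`, `predict`), the hyperparameters (`rank`, `lr`, `reg`, `epochs`) and the toy ratings are all hypothetical choices made for illustration.

```python
import math
import random


def rmse(predicted, actual):
    """Root mean square error between two equal-length rating sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def predict(U, V, user, item):
    """Predicted rating: dot product of the user and item factor vectors."""
    return sum(a * b for a, b in zip(U[user], V[item]))


def train_pmf(ratings, n_users, n_items, rank=2, lr=0.02, reg=0.02,
              epochs=500, seed=0):
    """Fit user/item factors by SGD on the L2-regularized squared error.

    Assumption: this corresponds to MAP estimation in basic PMF with
    Gaussian priors; all hyperparameter defaults are illustrative only.
    """
    rng = random.Random(seed)
    U = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_users)]
    V = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        for user, item, rating in ratings:
            err = rating - predict(U, V, user, item)
            for k in range(rank):
                uk, vk = U[user][k], V[item][k]
                # Gradient step on (err^2 + reg * (uk^2 + vk^2)) / 2.
                U[user][k] += lr * (err * vk - reg * uk)
                V[item][k] += lr * (err * uk - reg * vk)
    return U, V


# Toy (user, item, rating) triples; hypothetical data, not from Netflix.
toy_ratings = [
    (0, 0, 5), (0, 1, 4), (0, 2, 1),
    (1, 0, 4), (1, 1, 5),
    (2, 1, 1), (2, 2, 5),
    (3, 0, 1), (3, 2, 4),
]
actual = [r for _, _, r in toy_ratings]

# Baseline: predict the global mean rating for every pair.
mean_rating = sum(actual) / len(actual)
baseline_rmse = rmse([mean_rating] * len(actual), actual)

U, V = train_pmf(toy_ratings, n_users=4, n_items=3)
pmf_rmse = rmse([predict(U, V, u, i) for u, i, _ in toy_ratings], actual)
```

The sketch scores the model on the same ratings it was trained on, which is only useful as a sanity check; a comparison like the one in the abstract would compute `rmse` on held-out ratings.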
Similar resources
Clustering-Based Matrix Factorization (under review)
Recommender systems are emerging technologies that nowadays can be found in many applications such as Amazon, Netflix, and so on. These systems help users find relevant information, recommendations, and their preferred items. Matrix Factorization is a popular method in Recommendation Systems showing promising results in accuracy and complexity. In this paper we propose an extension of matrix fa...
A Survey on Predicting User Service Rating in Social Network Using Data Mining Methods
Recommender systems play an important role in our daily life. Recommender systems naturally suggest items to users that might be fascinating for them. We propose a probabilistic matrix factorization technique for recommendations. The PMF model is proposed which computes continuously with the number of investigations and, more specially, functions well on the massive, inadequate, and very uneve...
Bayesian Factorization Machines
This work presents simple and fast structured Bayesian learning for matrix and tensor factorization models. An unblocked Gibbs sampler is proposed for factorization machines (FM) which are a general class of latent variable models subsuming matrix, tensor and many other factorization models. We empirically show on the large Netflix challenge dataset that Bayesian FM are fast, scalable and more ...
Probabilistic Matrix Factorization
Many existing approaches to collaborative filtering can neither handle very large datasets nor easily deal with users who have very few ratings. In this paper we present the Probabilistic Matrix Factorization (PMF) model which scales linearly with the number of observations and, more importantly, performs well on the large, sparse, and very imbalanced Netflix dataset. We further extend the PMF ...
Maximum Margin Matrix Factorization with Netflix Data
Maximum Margin Matrix Factorization (MMMF), a collaborative filtering method, was recently introduced in [7] followed by an iterative solution presented in [6]. In this paper we analyze the performance of MMMF on a subset of the Netflix data based on RMSE and classification rate. We also present several modifications to improve the performance of the algorithm on the Netflix problem.